\input memo.tex[let,jmc]
\title{EFFECTIVE INTERACTIVE USE OF LARGE CHARACTER SETS}

	The interactive use of large character sets needs to be
distinguished from the somewhat simpler task of making them
available for the production of printed documents.

	The SUN terminal, or an inexpensive version thereof that can
replace Datadiscs, will provide the ability to display arbitrary
characters, even ones of variable width.  Moreover, we are now planning for
new computer equipment that might, for example, replace SAIL and
SCORE by something more powerful.  Nevertheless, unless we plan
some standards for using enlarged character sets and make the
corresponding software changes, we will lose even the mildly
enlarged character set we have on SAIL, a prospect that many who
make use of the extra characters will regard with dismay.

	Here is the result we should strive for; later we'll
consider how it might be achieved.

	1. Arbitrary characters can inhabit text files.  When
the files are being edited or read, the characters are visible.
When the files are printed, the characters appear on paper.
We are not referring here to arbitrary fonts or arbitrary sizes.
Still less do we mean an on-line TEX or PUB.  The extra characters
are treated like the extra characters in the SAIL set, but
arbitrary character sets may be used.

	2. When people at other computers read or print our
files they also get all the characters.  Of course, this will
require a certain amount of standardization.  Preferably, we
will standardize a character description language rather than
trying to agree on a standard set of characters.

	3. When a file is being edited, the new characters are treated
just like ordinary characters, as the extra characters of the
SAIL character set are now.  This should be made to work even when the
characters are peculiar to a given user.

	4. In the best of all possible worlds, the key tops would
have LCD displays, and there would be several shift keys like the
TOP key on the SAIL keyboards.  It seems very unlikely that we
will be able to achieve this.

	A more likely possibility is to use small keys with space above
them for a legend giving the additional characters in the CSD
Standard set.  The legend can be covered by a plastic or paper overlay
when a different set is being used.  It has been suggested that
the overlay plastic have a flange that projects down into a slot
where it can be machine read, but it isn't likely that this will
be available in the foreseeable future either.

	There can be one or more TOP keys as on the SAIL keyboard
and many calculator keyboards.  Alterna\-tive\-ly, one can use ``escape''
keys typed before the key whose interpretation is to be modified.
A third alternative is to use a standard keyboard and concoct
the modifications out of the control key and whatever else may
be available.  The third alternative should certainly be available,
because, for various reasons, cheap keyboards sometimes have to be used.
There is some consensus that extra TOP keys are preferable to
prefixed keys, because the latter put the editing
process into an intermediate state after the prefix key has
been typed and before the other key has been typed.  Such
states lead to errors.
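
	The difference between the two schemes can be sketched in Python.
This is only an illustration: the table EXTRA, the key name ESC and all
other names in it are hypothetical.  With a TOP key the shift arrives
together with the keystroke, so each event is decoded on its own.  With
a prefix key the decoder has to remember, between one event and the
next, that a prefix is pending.

```python
# Hypothetical table from a base key to the name of its extra character.
EXTRA = {"a": "alpha", "b": "beta"}


def decode_top(key: str, top_held: bool) -> str:
    """TOP-key style: one event carries everything, so no state is kept."""
    return EXTRA.get(key, key) if top_held else key


class PrefixDecoder:
    """Prefix-key style: the decoder has an intermediate 'pending' state."""

    PREFIX = "ESC"  # assumed name for the escape key

    def __init__(self) -> None:
        self.pending = False

    def feed(self, key: str):
        """Return the decoded character, or None while a prefix is pending."""
        if self.pending:
            self.pending = False
            return EXTRA.get(key, key)
        if key == self.PREFIX:
            self.pending = True  # intermediate state: nothing to show yet
            return None
        return key


d = PrefixDecoder()
print([d.feed(k) for k in ["a", "ESC", "a"]])
print(decode_top("a", top_held=True))
```

Anything that interrupts the user while the pending flag is set leaves
the next keystroke with a meaning he may not expect; the TOP-key
decoder has no such flag to get out of step.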

	The proposed facilities should be distinguished from
the ability to include arbitrary
characters in documents printed by PUB, TEX, etc. and also
from proposals to make such languages more interactive by
using good displays to allow the user of such a language
to see and edit directly the finished form of the document
he is producing.  Our proposals are entirely independent of
formatting languages.

	Specifically, we propose to provide the user with
the ability to interact directly with programs in arbitrary
character sets.  No special formatting takes place.  We are
merely providing the user with the ability to use
enlarged sets of symbols.

	This system does not provide all the output flexibility
of a formatting language like PUB or TEX.  Specifically, control over character
size and font would not be offered as a general system facility.
This is for two reasons.  First, providing the symbols in standard
sizes seems like a difficult enough task.  Second, it isn't clear
that control over size and font would be worth the costs to the
user in learning how to use the system.  Of course, programs
concerned with formatted output could use the display facilities
to give the user interactive control over these parameters of
the characters.

	Programs that interact with a user could use arbitrary
fonts and sizes for output and could readily switch fonts for
input so as to distinguish user input from program output.

\centerline{\bf IMPLEMENTATION CONSIDERATIONS}

	Here are some ideas for implementation.

	1. Editors like E and EMACS can be modified to represent
special characters by two- or three-byte strings (they now use
7 bits per byte).  Commands that space through a file must
space over the strings representing single characters.  If the
characters are of variable width, the editors will have to know
about that if they have JUSTIFY and FILL commands.
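
	As a minimal sketch of idea 1, suppose that two otherwise unused
7-bit codes serve as lead bytes: LEAD2 introduces a two-byte string and
LEAD3 a three-byte string.  This encoding is an assumption made only
for illustration, and every name in the Python below is hypothetical.
It shows how a command that spaces through a file can step over whole
characters rather than bytes.

```python
# Hypothetical encoding, assumed only for illustration: every byte is a
# 7-bit value, and two reserved codes announce a multi-byte character.
LEAD2 = 0x1E  # assumed lead byte: the character occupies 2 bytes in all
LEAD3 = 0x1F  # assumed lead byte: the character occupies 3 bytes in all


def char_len(buf: bytes, pos: int) -> int:
    """Return how many bytes the character starting at pos occupies."""
    lead = buf[pos]
    if lead > 0x7F:
        raise ValueError("not a 7-bit byte")
    n = 3 if lead == LEAD3 else 2 if lead == LEAD2 else 1
    if pos + n > len(buf):
        raise ValueError("truncated character at end of buffer")
    return n


def next_pos(buf: bytes, pos: int) -> int:
    """Move the cursor forward by one character, not one byte."""
    return pos + char_len(buf, pos)


def split_chars(buf: bytes) -> list[bytes]:
    """Cut a buffer into the byte strings that represent single characters."""
    out, pos = [], 0
    while pos < len(buf):
        n = char_len(buf, pos)
        out.append(buf[pos:pos + n])
        pos += n
    return out


text = b"a" + bytes([LEAD2, 0x41]) + b"b" + bytes([LEAD3, 0x01, 0x02])
print(len(text), "bytes,", len(split_chars(text)), "characters")
```

Spacing backward would require that the bytes following a lead byte be
distinguishable from lead bytes themselves, which this sketch does not
attempt.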

	2. As the SAIL keyboards do now, the keyboards will send
strings of seven bit characters to the computer.  Therefore, the
symbol will not be identified by the key that was typed.  It will
have to be established by the program with which the user is
interacting, from a user INIT file, by initial convention, or
by a system command.
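
	One way to picture idea 2 is as a stack of tables, each mapping
the string the keyboard sends to a symbol.  The Python below is a
sketch only: the codes, the symbol names, the format of the INIT lines
and the order of precedence are all assumptions.  A table set by a
system command could be one more layer of the same kind.

```python
from collections import ChainMap

# Each table maps the 7-bit string a keyboard sends to a symbol name.
# All codes and names below are made up for illustration.
initial_convention = {b"\x01a": "alpha", b"\x01b": "beta"}


def parse_init(text: str) -> dict:
    """Read hypothetical INIT lines of the form '<hex bytes> <symbol>'."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        code, symbol = line.split(None, 1)
        table[bytes.fromhex(code)] = symbol
    return table


user_init = parse_init("""
# this user rebinds the string 01 61
0161 forall
""")
program_table = {b"\x01b": "bottom"}  # set by the program being used

# Assumed precedence: program, then user INIT file, then initial convention.
keymap = ChainMap(program_table, user_init, initial_convention)

print(keymap[b"\x01a"], keymap[b"\x01b"])
```

The same string from the keyboard thus yields different symbols for
different users or programs, which is why the key alone cannot identify
the symbol.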

	3. Text files will need to keep information in directories
about the character set being used.  Only this will permit them
to be displayed at remote sites or even by other users of the
same system.

	4. The display system must be able to maintain a different
large character set for each of its users.  With the large address
space of the M68000 and the use of 64K RAMs for its memory, this
should not be a problem.  Of course, the user may put additional
burdens on this memory by switching among several character sets
using facilities analogous to SAIL's <break>R.

	5. The basic form of a character must be a drawing
made of curves rather than a dot image, since it must be
displayable and printable on a variety of devices with
different resolutions.  That is, Metafont or something like it
must be the basic form.  The support for display
and printing devices must include programs for converting
fonts from standard form to forms suitable for the device.

	6. There arise the administrative and technical
problems of standardizing and registering characters and
character sets.  This is in addition to standardizing
Metafont or other means of defining character shapes.
One could imagine a national registry of characters to
which the inventor of a new character or set could send
a design and from which he would receive a registration
number.  This number could then be included in the prefix
of a file using the character or characters.  Alternatively,
the file could refer to a local registry of characters or
even to one of the user's own character design files.
The preamble of the file could itself contain the designs
for exotic characters, though that might make them rather
long.  All these systems should co-exist, of course.

	We can suppose that commercial publishers,
and organizations that publish journals
such as ACM, IEEE and the American Mathematical Society,
would keep their own registries of characters.  It would
be important that these registries be network or Dialnet accessible to
people who prepare manuscripts for them to publish.
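
	The co-existence of sources described in idea 6 can be sketched
as a lookup that tries each place a design might live.  Everything in
the Python below is hypothetical: the names, the registration number,
the placeholder designs, and in particular the search order, which
simply tries the nearest source first.

```python
from dataclasses import dataclass, field


@dataclass
class CharSources:
    """Places a character design might come from (all names hypothetical)."""
    inline: dict = field(default_factory=dict)    # designs in the file's own preamble
    user: dict = field(default_factory=dict)      # the user's own design files
    local: dict = field(default_factory=dict)     # a local registry
    national: dict = field(default_factory=dict)  # a national registry, by number


def resolve(ref: str, src: CharSources) -> tuple[str, str]:
    """Return (source name, design) for a character reference.

    The search order, nearest source first, is an assumption of this sketch.
    """
    for name in ("inline", "user", "local", "national"):
        table = getattr(src, name)
        if ref in table:
            return name, table[ref]
    raise KeyError(f"no design found for character {ref!r}")


src = CharSources(
    inline={"my-arrow": "<curve description>"},    # placeholder design
    national={"000123": "<curve description>"},    # made-up registration number
)
print(resolve("my-arrow", src)[0])
print(resolve("000123", src)[0])
```

A remote site that fails to resolve a reference at least learns which
character is missing, rather than displaying the wrong one.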

	7. Most likely, Stanford will have to act before
any standardization committee can be formed and do its
job.  Therefore, we should undertake to make our work
as standardizable as possible, and this includes publishing
what we are up to.  However, it wouldn't hurt to ask
whether M.I.T. and C.M.U. and Xerox and maybe even IBM
are interested in talking about character standardization.
The costs of doing a good job may be large enough to warrant
an externally supported project.

	8. It seems that the problem of defining characters
apart from fonts interacts in various unpleasant ways
with the problem of font definition.  Perhaps there needs
to be a standard style (say, one like Times Roman) for defining
new characters.

	9. This draft may not sufficiently take into account
work already done in standardizing characters for
publishing purposes, but that work probably doesn't take into account
the requirements of the interactive use of large character
sets.

The TEX source file for this document is KEYBOA.TEX[W81,JMC] at SU-AI.

\smallskip\centerline{Copyright \copyright\ 1989\ by John McCarthy}
\smallskip\noindent{This draft of KEYBOA.TEX[W81,JMC]\ TEXed on \jmcdate\ at \theTime}
%File originated on 20-Sep-89
\vfill\eject\end